A Critique and Improvement of an Evaluation Metric for Text Segmentation
Authors
Abstract
The Pk evaluation metric, initially proposed by Beeferman et al. (1997), is becoming the standard measure for assessing text segmentation algorithms. However, a theoretical analysis of the metric finds several problems: the metric penalizes false negatives more heavily than false positives, over-penalizes near-misses, and is affected by variation in segment size distribution. We propose a simple modification to the Pk metric that remedies these problems. This new metric, called WindowDiff, moves a fixed-size window across the text and penalizes the algorithm whenever the number of boundaries within the window does not match the true number of boundaries for that window of text.
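The windowing idea described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' reference implementation. It assumes a segmentation is encoded as a list of 0/1 flags, one per gap between adjacent text units, where 1 marks a boundary. The function names `window_diff` and `p_k`, the encoding, and the requirement that the caller supplies the window size `k` are all assumptions made for this example. The `p_k` function is a simplified fixed-window reading of Pk, included only for contrast.

```python
from typing import Sequence


def _check(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> int:
    """Validate inputs and return the number of windows to slide over."""
    if len(reference) != len(hypothesis):
        raise ValueError("segmentations must cover the same number of gaps")
    n_units = len(reference) + 1  # n gaps separate n + 1 text units
    if not 0 < k < n_units:
        raise ValueError("k must be positive and smaller than the unit count")
    return n_units - k


def window_diff(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> float:
    """Fraction of windows whose boundary COUNT differs (assumed encoding)."""
    n_windows = _check(reference, hypothesis, k)
    errors = 0
    for i in range(n_windows):
        # Boundaries falling inside the window that spans k gaps from gap i.
        ref_count = sum(reference[i:i + k])
        hyp_count = sum(hypothesis[i:i + k])
        if ref_count != hyp_count:
            errors += 1
    return errors / n_windows


def p_k(reference: Sequence[int], hypothesis: Sequence[int], k: int) -> float:
    """Simplified Pk: only checks whether each window contains ANY boundary."""
    n_windows = _check(reference, hypothesis, k)
    errors = 0
    for i in range(n_windows):
        ref_same_segment = sum(reference[i:i + k]) == 0
        hyp_same_segment = sum(hypothesis[i:i + k]) == 0
        if ref_same_segment != hyp_same_segment:
            errors += 1
    return errors / n_windows


if __name__ == "__main__":
    # Five text units, one true boundary after the second unit.
    reference = [0, 1, 0, 0]
    # Hypothesis adds a spurious boundary right next to the true one.
    hypothesis = [1, 1, 0, 0]
    print("Pk         :", p_k(reference, hypothesis, k=2))
    print("WindowDiff :", window_diff(reference, hypothesis, k=2))
```

In this toy case the simplified `p_k` returns 0, because every window that contains a boundary in the reference also contains at least one in the hypothesis. `window_diff` returns 1/3, because the first window holds two hypothesized boundaries against one true boundary. That gap is one concrete instance of the false-positive blind spot the abstract attributes to Pk.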
Similar resources
Evaluation of the Parameters Involved in the Iris Recognition System
Biometric recognition is an automatic identification method based on unique features or characteristics possessed by human beings. Iris recognition has proved itself one of the most reliable biometric methods available, owing to the accuracy provided by its unique epigenetic patterns. The main steps in any iris recognition system are image acquisition, iris segmentation, iris norm...
Assessment of the Log-Euclidean Metric Performance in Diffusion Tensor Image Segmentation
Introduction: Appropriate definition of the distance measure between diffusion tensors has a deep impact on Diffusion Tensor Image (DTI) segmentation results. The geodesic metric is the best distance measure since it yields high-quality segmentation results. However, the important problem with the geodesic metric is the high computational cost of the algorithms based on it. The main goal of this ...
A Modified Character Segmentation Algorithm for Farsi Printed Text Using Upper Contour Labelling
In this paper, a modified segmentation algorithm for printed Farsi words is presented. This algorithm is based on a previous work by Azmi that uses the conditional labeling of the upper contour to find the segmentation points. The main objective is to improve the segmentation results for low quality prints. To achieve this, various modifications on local baseline detection, contour labeling an...
A Proposed Scheme for Performance Evaluation of Graphics/Text Separation Algorithms
We propose an objective, comprehensive, and complexity-independent metric for performance evaluation of graphics/text separation (text segmentation) algorithms. The metric includes a positive set and a negative set of indices, at both the character and the character string (text) levels, and it evaluates the detection accuracy of the location, width, height, orientation, skew, string length, an...
High Performance Implementation of Fuzzy C-Means and Watershed Algorithms for MRI Segmentation
Image segmentation is one of the most common steps in digital image processing. Many image segmentation algorithms (e.g., thresholding, edge detection, and region growing) are employed to classify a digital image into different segments. In this connection, finding a suitable algorithm for medical image segmentation is a challenging task, mainly due to noise, low contrast, and steep...
Journal title:
- Computational Linguistics
Volume 28, Issue
Pages -
Publication date 2002